1.        INTRODUCTION

This  appendix  describes  the  geostatistical  analysis  portion  of  "A  Geostatistical  Study  for 
Geology-Energy-Mineral  Resources  in  the  California  Desert."  A  complete  description  of 
the  study  is  in  the  main  report. 

In  general  terms,  the  geostatistical  analysis  involves  applying  statistical  analytical 
procedures  to  the  reported  occurrence  data,  and  to  the  geological  and  geophysical  data 
described in Appendices A and B. The results of applying these procedures are statistical
inferences regarding the likelihood of "occurrence" of one or more G-E-M resources
throughout  the  area  of  the  CDCA.  The  validity  of  these  inferences  is  assessed  by 
estimating  the  likelihood  that  the  areas  are  correctly  classified  in  the  occurrence 
category. 

In  using  the  results  of  the  geostatistical  analysis,  the  following  precautions  should  be 
considered: 


•  Geostatistical  analysis  for  all  of  the  types  of  resources  known  to  occur 
in  the  CDCA  is  not  possible  because  few  occurrences  have  been 
reported  for  some  commodities. 

•  The  occurrence,  geological  and  geophysical  data  are  subject  to  error  as 
discussed  in  Appendices  A  and  B. 

•  All  statements  of  probability  involve  the  likelihood  of  any  one  of  a  set 
of  events  occurring.  Thus,  for  example,  even  if  the  probability  of  not 
drawing  the  two  of  hearts  from  a  deck  of  cards  is  51/52  or  over  98 
percent,  it  is  still  possible  to  draw  the  two  of  hearts.  Likewise,  if  an 
area  is  classified  as  an  occurrence  area  with  a  probability  of  correct 
classification  of  99  percent,  it  is  still  possible  that  no  resource  will  be 
found  there. 

Figure C-1 is a flow chart of the steps used in the analysis. The steps shown are
discussed in the following sections.

2.        GEOLOGICAL  AND  GEOPHYSICAL  VARIABLES  USED  FOR  GEOSTATISTICAL
ANALYSIS


From  the  original  list  of  41  geological  and  geophysical  variables  (see  Appendix  B),  21 
were  chosen  for  geostatistical  analyses.  Three  criteria  were  used  in  the  selection 
process: 

•  Frequency  of  occurrence. 

•  Correlation  with  other  variables. 

•  Geologic  relevance. 

2.1      FREQUENCY  OF  OCCURRENCE

The  following  lithologic  units  were  not  used  for  geostatistical  analysis  because  they 
occur  too  infrequently  to   make  statistical  inferences. 

•  Lithologic unit 7 (Paleozoic and Precambrian metavolcanic rocks)
covers less than 0.1 percent of the CDCA.

•  Lithologic  unit  8  (Triassic  and  Jurassic  marine  sediments)  covers  less 
than  0.1  percent  of  the  CDCA. 

•  Contact  variables  21   through  31  and  33  are  all  less  than  60  km  in  total 
length  in  the  CDCA. 

Tables B-1, B-2 and B-3 show the frequency of occurrence of all the variables.

2.2  CORRELATION  WITH  OTHER  VARIABLES 

Highly correlated variables yield inaccurate measures of each variable's statistical
significance. In addition, using both of two highly correlated variables provides little
more information for classification purposes than using one of them. Table C-1 presents
the Pearson product moment correlation coefficients (76) for variables not eliminated in
Section 2.1 above.

Table C-1

PEARSON PRODUCT MOMENT CORRELATION COEFFICIENTS
All Variables Untransformed

[Correlation matrix for variables V2 through V40; the individual coefficients are not legible in this copy.]

The Pearson product moment correlation coefficient is a measurement of the correlation
between two variables. If the variables are x and y, the coefficient is:

$$ r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n \, S_x \, S_y} $$

where $r_{xy}$ is the Pearson product moment correlation coefficient; $x_1, x_2, \ldots, x_n$ and
$y_1, y_2, \ldots, y_n$ are the n values of the variables x and y, with means $\bar{x}$ and $\bar{y}$ and with
standard deviations $S_x$ and $S_y$.

This coefficient ranges in value between -1 and 1. It indicates the extent to which two
variables vary with respect to each other. If its value is 0, then the variables are
perfectly uncorrelated. If it is 1, the variables both increase or decrease together. If it is
-1, one variable decreases while the other increases. For this study, correlation
coefficients  greater  than  or  equal  to  0.90  or  less  than  or  equal  to  -0.90  indicate  very  high 
correlations. 
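
The screening rule above can be sketched in a few lines of Python. This is only an illustration, not the program used in the study: the function names and the sample values are hypothetical, and the standard deviations are assumed to be computed with a divisor of n so that the coefficient stays within -1 and 1.

```python
import math

def pearson_r(x, y):
    """Pearson product moment correlation, using the n * S_x * S_y form."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Assumption: standard deviations use a divisor of n, which keeps r in [-1, 1].
    s_x = math.sqrt(sum((v - mean_x) ** 2 for v in x) / n)
    s_y = math.sqrt(sum((v - mean_y) ** 2 for v in y) / n)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return cross / (n * s_x * s_y)

def is_very_high(r, threshold=0.90):
    """Apply the |r| >= 0.90 criterion for a very high correlation."""
    return abs(r) >= threshold

# Hypothetical per-cell values, for illustration only.
fault_count = [0, 1, 2, 3, 4]
fault_length = [0.0, 2.1, 3.9, 6.2, 8.0]
r = pearson_r(fault_count, fault_length)
print(round(r, 3), is_very_high(r))
```

A pair of variables flagged in this way would be a candidate for dropping one member, as was done for variables 35 and 38.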

Variables  35  (length  of  thrust  faults)  and  38  (number  of  non-thrust  faults)  were 
eliminated  because  they  are  highly  correlated  with  variables  36  (number  of  thrust  faults) 
and  37  (length  of  non-thrust  faults)  respectively.  Variables  36  and  37  were  kept  instead 
of  35  and  38  to  keep  a  variety  of  types  of  frequency  distributions  (number  and  length)  in 
the  analysis. 

2.3      GEOLOGIC  RELEVANCE

Variables  12,  16,  17  (Quaternary  lithologic  units)  and  18  (water  and  unmapped  areas)  were 
eliminated  because  they  are  not  correlated  with  mineral  occurrences  and  tend  to  mask 
the  relevant  geological  variables.  Any  statistical  relationships  that  might  appear 
between  any  of  these  variables  and  mineral  occurrences  would  be  coincidence  and 
misleading  if  used  for  predictive  purposes. 

3.        REGRESSION  ANALYSIS

Regression  analysis  was  attempted  with  talc.  Talc  was  selected  because  of  the  high 
proportion  of  deposits  for  which  quantifiable  production  data  are  available.  Of  74  talc 
occurrences  in  the  CDCA,  tonnage  production  figures  are  available  for  52.  The  best 
quantitative  measure  for  a  particular  mineral  occurrence,  such  as  talc,  would  be  the 
total  magnitude  of  the  deposit  (reserves  plus  cumulative  production).  However,  reserve 
estimates  for  talc  (as  well  as  all  other  minerals)  are  not  available. 

The  tonnage  data  for  the  52  deposits  were  regressed  against  the  21  geologic  variables, 
but  the  results  were  discouraging.  The  "goodness  of  fit"  between  talc  production  in  tons, 
and all the geologic variables combined is only 32.7 percent. This means that variation in
talc tonnage is about one-third "explained" by variations in the regional geology, and that
about two-thirds must be "explained" by other factors. These other factors probably
include location, road access, quality (i.e., level of impurities) and other influences
that strongly affect talc production.
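
To make the "goodness of fit" figure concrete, the sketch below computes the fraction of variance explained by a least-squares fit. It assumes that the 32.7 percent quoted above is a coefficient of determination (R squared), which the text does not state explicitly. It also uses a single predictor for brevity, whereas the study regressed tonnage on 21 variables. The function name and data are hypothetical.

```python
def r_squared(x, y):
    """Fraction of the variance in y explained by a least-squares line in x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares.
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot

# Hypothetical values: one geologic variable against production tonnage.
geologic_variable = [1.0, 2.0, 3.0, 4.0, 5.0]
tonnage = [2.1, 3.9, 6.2, 7.8, 10.1]
print(round(100.0 * r_squared(geologic_variable, tonnage), 1), "percent explained")
```

Under this reading, a value near 33 percent leaves roughly two-thirds of the variation in tonnage unexplained by the predictors.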

Conventional  regression  techniques  are  not  justified  with  the  other  commodities  because 
so  many  of  the  occurrences  consist  of  information  that  the  commodity  is  present,  but  no 
production  number  is  available.  However,  an  attempt  was  made  to  find  a  regression 
relationship  between  production  categories  (see  Section  6)  and  the  geologic  variables. 
This  is  not  a  conventionally  accepted  technique,  but  was  applied  to  the  three  categories: 

1.  Gold, 

2.  Combined  copper,  lead,  silver  and  zinc,

3.  Combined  iron  and  manganese.

The  results  were  even  more  discouraging.  All  correlations  were  less  than  15  percent. 
Consequently,  no  further  regressions  were  conducted. 


4.        CLUSTER  ANALYSES 

As  originally  conceived,  cluster  analysis  was  to  be  used  in  the  study.  As  the  study 
proceeded,  we  decided  that  cluster  analysis  was  not  a  useful  technique  in  this  case. 
Cluster  analysis  attempts  to  provide  a  hierarchical  classification  of  "objects"  in  which 
the  linkages  which  bind  the  hierarchy  of  clusters  together  indicate  the  relative  degree  of 
similarity  between  the  clusters  and,  in  turn,  between  the  objects  being  classified.  The 
objects  which  would  have  been  classified  are  the  6,850  geographic  cells  (4  km  x  4  km).  If 
these  individual  cells  were  analyzed  by  cluster  analysis,  the  cluster  diagram  would  be 
immensely  complex.  The  classification  thus  resulting  would  involve  computation  of  a 
"similarity  matrix"  based  on  a  comparison  of  the  geologic  variables.  The  similarity 
matrix  could  consist  of  conventional  correlation  coefficients.  Each  coefficient  would 
represent the comparison between one pair of cells. Thus, there are
$(n^2 - n)/2$ such possible pairs of comparison (n = 6,850) in the CDCA, which yields a
similarity matrix (one of its two triangular halves) containing 23,457,825 coefficients.
Such  a  matrix  is  absolutely  unmanageable.  It  could  be  programmed  for  calculation  on  a 
large  computer,  but  the  program  execution  time  to  run  a  cluster  analysis  on  such  a 
matrix  would  probably  be  measured  in  days,  even  on  a  large,  extremely  fast  computer 
such  as  the  IBM  370/168  used  in  this  study. 
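
The size of the similarity matrix quoted above can be checked with a short calculation. The function name below is illustrative only.

```python
def pair_count(n):
    """Number of distinct pairs among n objects: (n^2 - n) / 2."""
    return (n * n - n) // 2

cells = 6850  # number of 4 km x 4 km cells in the CDCA
print(pair_count(cells))  # 23457825
```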

Other  problems  would  arise  if  cluster  analysis  were  to  be  used.  As  its  name  implies, 
cluster  analysis  involves  the  "clustering"  of  the  objects  being  classified.  As  applied  to 
this  study,  the  objects  would  be  the  geographic  cells.  There  is  no  suitable  way  of 
representing  the  clustering  relationships  on  maps.  The  cluster  diagram  involves  a 
hierarchical  arrangement  of  linkage  which  can  only  be  effectively  viewed  as  the  "cluster 
diagram"  itself. 

5.        DISCRIMINANT  FUNCTION  ANALYSIS

5.1      PRINCIPLES 

Discriminant function analysis (DFA) is the statistical technique used in this study to
classify 4 km x 4 km cells according to the likelihood of occurrence of mineral resources.
This method was used to discriminate between either two or three categories of mineral
occurrence on the basis of a number of geological parameters and one geophysical
parameter. Relationships between these parameters and mineral occurrences were
determined in a subset of cells of the entire area (called the "training set") and were
then applied to the entire area.

The method begins with the assumption that some set $(x_1, \ldots, x_k)$ of variables, called
"discriminators", can be chosen whose values are closely related to membership
properties of the cell populations in question. In our case, the discriminators are the
areal percentage of various lithologic units (Table B-1), length of faults, length of
contacts between lithologic units, fault curvature and Bouguer gravity. (See Appendix B
for a discussion of these features.) Using the values of the discriminators and the
membership properties of the training cells, the technique yields a linear function of the
form

$$ Y(x_1, \ldots, x_k) = a_1 x_1 + \ldots + a_k x_k \qquad \text{(C-1)} $$

where the $a_i$ are chosen so that the means $\bar{Y}_i$ of Y over the populations in the training set
are maximally separated relative to the variation of Y within the populations (76). For
example, if there are two populations whose sample sizes (number of training cells) are
$n_1$ and $n_2$, then the separation between the means is measured by $(\bar{Y}_1 - \bar{Y}_2)^2$. The
variation of Y in the two populations is

$$ \sum_{i=1}^{2} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_i)^2 $$

where $Y_{ij}$ denotes the value of Y for the j-th individual in population i. The discriminant
function is then the linear combination of the form of equation (C-1) above which
maximizes

$$ \frac{(\bar{Y}_1 - \bar{Y}_2)^2}{\sum_{i=1}^{2} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_i)^2} $$

When Y is thus chosen, the means $\bar{Y}_1$ and $\bar{Y}_2$ are computed for the sample population,
and some intermediate value is chosen as the point best discriminating population 1 from
population 2. Sometimes a clear choice of this intermediate value is not possible, and
the midpoint $(\bar{Y}_1 + \bar{Y}_2)/2$ is chosen for convenience.
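
The two-population case described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the SPSS routine used in the study. It assumes the standard result that the ratio of between-mean separation to within-population variation is maximized by coefficients proportional to the inverse of the pooled within-population scatter matrix times the difference of the mean vectors. All function names and cell values are hypothetical.

```python
def mean_vector(rows):
    """Mean of each discriminator over a set of cells."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def within_scatter(groups):
    """Pooled within-population sums of squares and cross products."""
    k = len(groups[0][0])
    w = [[0.0] * k for _ in range(k)]
    for rows in groups:
        m = mean_vector(rows)
        for row in rows:
            d = [v - mv for v, mv in zip(row, m)]
            for p in range(k):
                for q in range(k):
                    w[p][q] += d[p] * d[q]
    return w

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def score(row, coeffs):
    """Value of Y = a_1 x_1 + ... + a_k x_k for one cell."""
    return sum(c * v for c, v in zip(coeffs, row))

def fisher_discriminant(pop1, pop2):
    """Return the coefficients a_i and the midpoint cutoff between the two means of Y."""
    m1, m2 = mean_vector(pop1), mean_vector(pop2)
    coeffs = solve(within_scatter([pop1, pop2]), [a - b for a, b in zip(m1, m2)])
    y1 = sum(score(r, coeffs) for r in pop1) / len(pop1)
    y2 = sum(score(r, coeffs) for r in pop2) / len(pop2)
    return coeffs, (y1 + y2) / 2.0

def classify(row, coeffs, cutoff):
    """Population 1 scores above the midpoint with this choice of coefficients."""
    return 1 if score(row, coeffs) > cutoff else 2

# Hypothetical training cells: (areal fraction of one unit, fault length in km).
occurrence_cells = [[0.8, 5.0], [0.7, 6.0], [0.9, 4.0], [0.6, 5.5]]
absence_cells = [[0.1, 1.0], [0.2, 0.5], [0.0, 2.0], [0.3, 1.5]]
coeffs, cutoff = fisher_discriminant(occurrence_cells, absence_cells)
print(classify([0.75, 5.2], coeffs, cutoff))
```

With real data the pooled scatter matrix can be nearly singular when discriminators are highly correlated, which is one practical reason for the variable screening described in Section 2.2.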

The  Statistical  Package  for  the  Social  Sciences  (SPSS)  (77),  developed  at  the  University 
of  Chicago  and  available  at  the  Stanford  Center  for  Information  Processing,  contains  the 
program  DISCRIMINANT  which  was  used  in  our  analysis.  DISCRIMINANT  has 
capabilities  to  perform  discriminant  analysis  on  a  number  of  populations  and 
discriminators. In our study, two- and three-population category analyses suffice; these
require the computation of one and two discriminant functions, respectively. (The
number of discriminant functions required is the lesser of the number of discriminators
and the number of populations minus one) (77).
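
The counting rule in the parenthetical note can be written out directly. The function name is illustrative, and the count of 21 discriminators is taken from Section 2.

```python
def n_discriminant_functions(n_discriminators, n_populations):
    """Lesser of the number of discriminators and the number of populations minus one."""
    return min(n_discriminators, n_populations - 1)

print(n_discriminant_functions(21, 2))  # 1
print(n_discriminant_functions(21, 3))  # 2
```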

In the present application, DISCRIMINANT selects discriminators to be used one at a
time to maximize $|\bar{Y}_1 - \bar{Y}_2|$ (absolute value), called the Mahalanobis distance, $D^2$, until
inclusion of further discriminators adds negligible separation. Once the discriminant
function  has  been  determined,  DISCRIMINANT  is  used  to  classify  each  4  km  by  4  km  cell 
into  its  predicted  occurrence  category.  Each  cell  in  the  population  is  classified, 
including  the  training  cells  whose  correct  classification  is  known.  The  percentage  of 
training  cells  which  are  correctly  classified  by  DISCRIMINANT  is  one  measure  of  the 
capability  of  the  discriminant  function  to  differentiate  among  categories. 
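
The percentage of correctly classified training cells can be tallied as in the sketch below. The category codes and the eight sample cells are hypothetical, and the functions are an illustration rather than part of DISCRIMINANT.

```python
from collections import Counter

def percent_correct(actual, predicted):
    """Percentage of training cells whose predicted category matches the known one."""
    hits = sum(1 for a, p in zip(actual, predicted) if a == p)
    return 100.0 * hits / len(actual)

def confusion_counts(actual, predicted):
    """Count (actual, predicted) pairs to show where misclassification occurs."""
    return Counter(zip(actual, predicted))

# Hypothetical categories for eight training cells (1 = absence, 2 = occurrence).
actual = [1, 1, 1, 1, 2, 2, 2, 2]
predicted = [1, 1, 1, 2, 2, 2, 1, 2]
print(percent_correct(actual, predicted))           # 75.0
print(confusion_counts(actual, predicted)[(2, 1)])  # 1
```

Because the training cells were also used to fit the function, a percentage obtained this way is likely to be optimistic compared with performance on cells outside the training set.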

Several  obvious  limitations  of  discriminant  function  analysis  should  be  noted.  First,  it  is 
a  technique  for  determining  the  most  probable  population  distribution  of  a  collection  of 
entities  (in  this  case  4  km  x  4  km  cells)  whose  membership  properties  are  not  known  a 
priori.  The  results  of  the  analysis  should,  therefore,  be  viewed  as  "best  guesses"  of 
population  membership.  Second,  incomplete  knowledge  exists  concerning  the  nature  of 
the  relationship  between  the  discriminators  and  population  membership,  and  the  choice 
of  discriminators  is  restricted  by  the  availability  of  conveniently  partitioned  data. 
Hence  the  choice  of  discriminators  in  our  application  requires  tradeoffs  between 
judgments  based  on  geologic  theory  and  the  limitations  imposed  by  the  accessibility  of 
information.  The  reliability  of  the  discriminant  function  analysis  performed  in  this  study 
and  the  methods  for  evaluating  it  are  discussed  in  later  sections  of  this  appendix. 

5.2      SELECTION  OF  TRAINING  DATA  AND  COMMODITY  CATEGORIES

The  "training  set"  was  selected  to  provide  the  discriminant  function  coefficients  which 
were used to classify the remainder of the CDCA. About 10 percent of the area of the
CDCA  was  selected  to  provide  the  training  set.  The  training  set  consists  of  information 
derived  from  the  following  areas: 

Southwest  quarter  of  UTM  block  MK 
Northeast  quarter  of  UTM  block  NK 
Southwest  quarter  of  UTM  block  PK 
Northeast  quarter  of  UTM  block  PG 

These  areas  were  selected  because  they  provide  both  geologic  and  geographic  diversity 
within  the  CDCA,  as  well  as  a  reasonable  representation  of  the  mineral  commodities 
present  in  the  CDCA. 

The  "training  set"  employed  here  consists  of  those  cells  within  the  UTM  blocks  as 
described  above.  These  cells  are  used  to  "train"  the  discriminant  function  analysis  tool 
(i.e.,  these  cells  provide  the  information  used  to  calculate  the  coefficients  of  the  various 
discriminant  functions  employed).  Thus,  the  concept  is  that  of  using  part  of  the  area  to 
"train"  the  procedure  employed  (DFA)  to  analyze  the  remainder  of  the  area  (the  "target 
set"). 

The  mineral  commodities  were  aggregated  initially  into  the  following  six  categories  for 
the  preliminary  DFA  analyses: 

1.  Lead,  zinc,  copper  and  silver  combined. 

2.  Gold. 

3.  Iron  and  manganese  combined.

4.  Copper. 

5.  Lead,  zinc,  copper,  silver  and  gold  combined. 

6.  Tungsten. 

The rationale for the selection of these commodity categories is as follows.


1.  Lead, zinc, copper and silver occur together in many of the ore
deposits  in  the  CDCA  forming  the  familiar  "base  metal"  or 
"hydrothermal"  type  deposits.  Therefore,  it  is  logical  that  they  be 
statistically  aggregated.  Where  production  data  are  available,  the 
aggregation  involves  transformation  of  the  production  statistics  to 
dollar  equivalents,  as  discussed  in  Appendix  B. 

2.  Gold  deposits  in  the  CDCA  are  sufficiently  numerous  that  segregation 
of  gold  as  a  category  by  itself  was  deemed  desirable. 

3.  Iron  and  manganese  frequently  occur  in  similar  geologic  environments. 
However,  these  two  commodities  rarely  occur  in  association  with  the 
base  metals  and  thus,  their  segregation  from  the  base  metals  and  from 
gold  is  reasonable. 

4.  Copper  occurrences  in  the  CDCA  are  sufficiently  numerous  to  justify 
copper  as  a  separate  category.  Thus,  copper  was  included  in 
hydrothermal  categories  and  considered  separately  as  well. 

5.  Lead, zinc, copper, silver plus gold often occur together, forming a
suite of hydrothermal deposits. Thus, the four base metals plus gold
were run as a combined group.

6.  Tungsten occurs in various geologic environments. However,
occurrences in any single geologic environment are not frequent enough
to support a statistical analysis. All tungsten occurrences were,
therefore, combined for DFA.


No  statistical  manipulations  were  undertaken  for  non-metallic  deposits,  except  for  the 
regression-analysis  experiment  with  talc.  The  rationale  for  excluding  the  non-metallics 
(including  salines)  is  that,  generally  speaking,  there  is  little  to  tie  their  occurrences  to 
details of the regional geology as represented on maps. Moreover, the number of
occurrences of any particular non-metallic commodity category is not sufficient
to justify the development of a statistical relationship.

Evaporites  are  obviously  strongly  correlated  with  Quaternary  deposits,  but  it  would  be 
misleading  to  produce  a  statistical  forecast  of  the  occurrences  of  evaporite  commodities 
merely  on  the  basis  of  the  presence  of  Quaternary  deposits.  A  similar  argument  may  be 
applied  to  sand  and  gravel  deposits.  Furthermore,  the  value  of  sand  and  gravel  deposits 
is  very  sensitive  to  the  proximity  to  sites  of  consumption  and  to  roads  and  railroads. 

Commercial limestone quarries, like sand and gravel, are sensitive to location. Thus, no
attempt  has  been  made  to  prepare  statistical  forecasts  of  the  potential  of  commercial 
limestone  deposits. 

DFA  techniques  were  not  applied  to  talc.  The  reasons  for  this  are  principally  geological. 
Talc,  being  a  metamorphic  mineral,  tends  to  be  associated  with  metamorphic  terrains. 
For  regional  appraisal  purposes,  it  would  probably  be  more  effective  to  label  those  map 
units  that  contain  metamorphics  as  having  talc  potential  than  it  would  be  to  attempt  to 
make  a  statistically  based  forecast. 

The remaining non-metallics, including oil and natural gas, carbon dioxide, and
geothermal fluids, are equally unsuited for statistical prediction based on information
available from geologic maps. These are primarily subsurface phenomena, and
relationships between them and the features displayed on the geologic map are tenuous
at best. In addition, they have not been developed in the CDCA in commercial forms
(except some CO2), so there is no production base upon which to perform statistical
analyses.

5.3      SELECTION  OF  PRODUCTION  CATEGORIES 

"Production  categories"  for  DFA  analysis  were  established  as  described  in  Section  5.1. 
DFA  was  performed  on  each  commodity  category  by  initially  using  only  two  production 
categories,  namely  "absence,"  which  may  be  best  defined  as  "no  known  or  reported 
occurrence",  and  "occurrence".  No  distinction  between  magnitudes  of  different 
occurrence  categories  was  made.  DFA  was  conducted  using  three  production  categories 
on  combined  copper,  zinc,  lead  and  silver;  on  gold;  on  copper,  zinc,  lead,  silver  and  gold 
combined;  and  on  tungsten;  which  collectively  have  enough  occurrences  to  support  such 
an application. When three production categories are employed, they are defined as 1,
"no known or reported occurrence"; 2, "occurrence but production (if any) aggregating
less than $50,000"; and 3, "aggregate production of $50,000 or more".

These two- and three-category groupings seem appropriate given the information available
and the degree of success in applying DFA.
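
The three production categories can be expressed as a small assignment rule. The sketch below is illustrative: the function and argument names are hypothetical, and a missing production figure is assumed to place an occurrence in category 2, in keeping with the phrase "production (if any)".

```python
def production_category(has_occurrence, production_dollars=None, threshold=50_000):
    """Assign a cell to one of the three production categories.

    1: no known or reported occurrence
    2: occurrence, with production (if any) aggregating less than the threshold
    3: aggregate production at or above the threshold
    """
    if not has_occurrence:
        return 1
    # Assumption: an occurrence with no reported production figure falls in category 2.
    if production_dollars is not None and production_dollars >= threshold:
        return 3
    return 2

print(production_category(False))           # 1
print(production_category(True))            # 2
print(production_category(True, 12_000))    # 2
print(production_category(True, 50_000))    # 3
```

For the two-category runs, categories 2 and 3 collapse into a single "occurrence" category.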

5.4      VARIABLE  TRANSFORMATIONS

To assess the validity or confidence of the statistical results, it is desirable to have
variables that are normally distributed and statistically independent of each other. A
normal distribution has skewness and kurtosis of zero (77). Skewness is the third moment
about the mean, $\bar{X}$, of a distribution of values ($X_i$) of a variable. Skewness is a measure
of the symmetry of the dispersion of the values about the mean. A symmetrical
distribution has a skewness of zero. Skewness is defined as follows:

$$ \text{Skewness} = \frac{\sum_{i=1}^{N} (X_i - \bar{X})^3}{S^3 N} $$

where N is the number of values of the variable.

Kurtosis is the fourth moment about the mean. Kurtosis is a measure of the "flatness" of
the distribution. A positive kurtosis indicates a "peaked" distribution. Kurtosis is
defined as follows:

$$ \text{Kurtosis} = \frac{\sum_{i=1}^{N} (X_i - \bar{X})^4}{S^4 N} - 3 $$

where S is the standard deviation.

The variance, $S^2$, is the second moment about the mean. Variance is a measure of the
dispersion of the values about the mean. The larger the variance, the more dispersed
the data. Variance, the square of the standard deviation, is defined as follows:

$$ S^2 = \frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N - 1} $$
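
The three definitions above can be computed together, as in the following sketch. The function name and sample values are hypothetical. The N - 1 divisor is used for the variance and the N divisor for skewness and kurtosis, as in the formulas above.

```python
import math

def moments(values):
    """Return (mean, variance, skewness, kurtosis) for a list of values."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    s = math.sqrt(variance)
    skewness = sum((v - mean) ** 3 for v in values) / (n * s ** 3)
    kurtosis = sum((v - mean) ** 4 for v in values) / (n * s ** 4) - 3.0
    return mean, variance, skewness, kurtosis

# Hypothetical areal fractions: mostly zero with a few large values, so the
# distribution has a long right tail and a positive skewness.
areal_fraction = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.1, 0.4, 0.9]
print(moments(areal_fraction))
```

Variables that are zero in most cells, such as the areal fraction of an uncommon lithologic unit, tend to give large positive skewness and kurtosis under these definitions.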


Analysis of the skewness and kurtosis of the geological and geophysical variables shows
that the frequency distributions of the variables are similar to those encountered in a
previous geostatistical study in Norway by Prelat (80). The transformations applied by
Prelat were applied to the CDCA data and are shown in Table C-3. Table C-2 shows the
statistics of the untransformed and transformed variable distributions. Comparison of
the skewness and kurtosis values shows an improvement towards normality after
transformation. However, the transformed distributions are still not normal.

Table C-2

BASIC STATISTICS OF GEOLOGIC VARIABLES
USED IN GEOLOGIC CALCULATIONS*

                  Untransformed Units                       Transformed Units
Variable            Standard                                  Standard
Number        Mean Deviation  Skewness  Kurtosis        Mean Deviation  Skewness  Kurtosis

       1      .007      .052    11.313   153.098        .090      .171     8.411    85.634
       2      .051      .100     3.819    14.947        .272      .478     3.207    10.687
       3      .018      .101     6.923    51.805        .127      .307     6.039    40.880
       4      .021      .103     6.415    45.645        .142      .320     5.354    33.116
       5      .005      .041    12.587   187.238        .083      .138     9.196    99.024
       6      .012      .071     8.160    76.914        .113      .231     6.147    44.448
       9      .004      .040    12.449   179.922        .082      .134     9.310    99.670
      10      .003      .026    19.581   488.340        .077      .099     10.90   163.924
      11      .136      .250     1.956     2.758        .513      .717     1.629     1.772
      13      .026      .102     5.282    31.342        .166      .327     3.991    17.797
      14      .005      .032    12.149   195.492        .089      .130     7.061    63.920
      15      .048      .144     3.888    15.878        .241      .439     3.025     9.550
      19      .176     1.804    15.388   311.418       1.036      .319    10.700   132.100
      20      .206     1.610    11.697   185.994       1.048      .328     8.160    76.840
      32      .076      .970    19.392   461.045       1.018      .198    13.800   225.126
      34      .030      .564    28.854  1011.024       1.008      .121    20.520   504.550
      36      .061      .490    12.649   225.852       1.020      .143     9.180   102.610
      37     5.440     8.386     2.032     4.695       2.100      .017     1.080      .230
      39      .276     1.000     6.433    59.328       1.092      .287     4.222    22.560
      40     1.725     7.365     8.519    80.582       1.390      .891     5.670    39.046
      41   838.163   231.247    -3.251     8.927       6.503     1.141     -3.39      9.58

*  See Appendix B for definitions of variables.

Table C-3

VARIABLE TRANSFORMATIONS

Transformation                      Variables To Which Applied ($X_i$)

$2 \arcsin \sqrt{X_i + .001}$       Map units (1-18)

$\sqrt{X_i + 1.0}$                  Contacts and faults (19-40)

$\log_{10}(X_i + 10.0)$             Bouguer gravity (41)
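
The transformations of Table C-3 can be sketched as below. Several points are assumptions rather than facts from the report: the map-unit and contact/fault transformations are taken to be 2 arcsin sqrt(X_i + .001) and sqrt(X_i + 1.0), which is consistent with the transformed means and standard deviations in Table C-2; the logarithm is taken as base 10, following the table; and the arcsine argument is clamped at 1 so that a cell covered entirely by one unit does not fall outside the function's domain. Function names are hypothetical.

```python
import math

def transform_map_unit(x):
    """Variables 1-18 (areal fraction of a map unit): 2 arcsin sqrt(x + .001)."""
    # Assumption: clamp the argument so that x = 1.0 stays within the arcsine domain.
    return 2.0 * math.asin(min(1.0, math.sqrt(x + 0.001)))

def transform_contact_or_fault(x):
    """Variables 19-40 (contacts and faults): sqrt(x + 1.0)."""
    return math.sqrt(x + 1.0)

def transform_gravity(x):
    """Variable 41 (Bouguer gravity): log10(x + 10.0), base as given in Table C-3."""
    return math.log10(x + 10.0)

print(round(transform_map_unit(0.0), 4))
print(transform_contact_or_fault(0.0))
print(transform_gravity(90.0))
```

Each transformation compresses the long upper tail of its variable, which is why the skewness and kurtosis values fall after transformation without reaching zero.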

5.5  CASES  CONSIDERED 

As  discussed  in  Section  5.2,  sufficient  occurrence  data  do  not  exist  to  perform 
discriminant  analysis  on  all  commodities  in  the  CDCA.  Some  commodities  with 
sufficient  data  to  be  considered  alone  were  grouped  together  because  they  are  often 
found  in  similar  geologic  environments.  The  commodities  and  discriminant  analysis  runs 
are  described  in  Table  C-4. 

5.6  STATISTICAL  MEASUREMENTS  OF  ERROR  AND  SIGNIFICANCE 

Any probabilistic statement contains some uncertainty. As applied here, that means the
DFA predictions, which are probability statements of occurrence of certain minerals,
will have a statistical error associated with them. This error is a measure of how well
the model compares with reality and how accurately the variables selected by the model
are measured. It means that some of the predictions are inaccurate and some cells are
misclassified. This does not indicate a mistake in the model, only a lack of perfect
information.

There  is  no  way  to  measure  the  size  of  the  statistical  error  other  than  to  field  test  every 
cell  in  the  CDCA  and  compare  the  results  with  the  DFA  predictions.  However,  there  are 
some  tests  that  can  be  performed  to  give  an  idea  of  the  magnitude  of  the  error  that  can 
be  expected,  and  they  are  discussed  below. 

5.6.1  F-Test  for  Significance  of  Separation  Between  Means  of  Categories

In  using  the  results  of  discriminant  function  analyses,  it  is  important  to  ascertain 
whether  the  discriminators  are  adequate  to  distinguish  among  the  populations  under 
consideration.  In  the  cases  studied  here,  the  populations  are  sets  of  4  km  by  4  km  cells 
where  certain  G-E-M  resources  have  or  have  not  been  reported,  and  the  discriminators 
are  the  geological  variables  obtained  from  the  Geologic  Map  of  California,  plus  Bouguer 
gravity.  A  set  of  discriminators  generally  will  be  adequate  for  a  particular  commodity 
provided  they  are  sufficiently  related  to  the  occurrence  of  that  mineral  to  exhibit 
regular  and  noticable  variations  between  cells  of  the  '•occurrence"  and  "non-occurrence" 
categories.    This  proviso  actually  reduces  to  two  requirements.    First,  the  means  of  the 

Table C-4

CASES CONSIDERED
DISCRIMINANT ANALYSIS(a)

Name of Run           Brief Description

Two-Category Runs

GOLD NO TRANSF.       High and low probability of occurrence of gold. Unlike
                      other runs, this run does not use transformations on the
                      geologic variables to make them more nearly normal.

GOLD MINRESID         High and low probability of occurrence of gold, with
                      discriminant function found by minimizing
                      $(1 + D^2/4)^{-1}$.

GOLD                  High and low probability of occurrence of gold.

HYDRO W/GOLD(b)       High and low probability of occurrence of any of the
                      following: gold, lead, silver, zinc, copper, either
                      individually or in any combination.

HYDRO W/O GOLD(b)     High and low probability of occurrence of any of the
                      following: lead, silver, zinc, copper, either
                      individually or in any combination.

IRON-MANGANESE        High and low probability of occurrence of either iron or
                      manganese, or both.

COPPER                High and low probability of occurrence of copper.

TUNGSTEN              High and low probability of occurrence of tungsten.

HYDWOGP               Production or no known production of any of the
                      following: lead, silver, zinc, copper.

Three-Category Runs

HYDRO W/GOLD(b)       Occurrence valued at under $50,000, occurrence valued
                      at over $50,000, or low probability of occurrence of any
                      of the following: gold, lead, silver, zinc, copper.

HYDRO W/O GOLD(b)     Occurrence valued at under $50,000, occurrence valued
                      at over $50,000, or low probability of occurrence of any
                      of the following: lead, silver, zinc, copper.

GOLD                  Occurrence valued at under $50,000, occurrence valued
                      at over $50,000, or low probability of occurrence of
                      gold.

TUNGSTEN              Occurrence valued at under $50,000, occurrence valued
                      at over $50,000, or low probability of occurrence of
                      tungsten.

(a) Unless otherwise indicated, the discriminant function is found by maximizing the
    Mahalanobis distance.

(b) The name "HYDRO" has been used informally here to designate those classes of ore
    deposits commonly referred to as "hydrothermal", and in this case, specifically, to
    deposits of copper, lead, zinc and silver occurring individually or in any combination.

discriminators must be closely related to the commodity's occurrence. Second, the
discriminators  must  not  exhibit  large  variances  which  would  cause  the  populations,  when 
identified  solely  by  the  values  of  the  discriminators,  to  overlap  significantly. 

One test of the first requirement is to measure the significance of the difference between
the means of the discriminant function $Y(x_1, \ldots, x_k)$ taken over the sample populations.
To do this, we test the hypothesis that the Mahalanobis distance, $D^2 = |\bar{Y}_1 - \bar{Y}_2|$, is
nonzero, i.e., that the set of discriminators is adequate to define a function Y for which
$D^2$ is larger than could reasonably be expected were the entire populations distributed
independently of the discriminators.


This  test,  called  an  F  test,  calculates  the  probability  that  the  discriminant  function  Y  has 
the same mean for population 1 as for population 2. A high probability of having the
same  mean  indicates  that  the  discriminators  are  not  adequate  to  differentiate  between 
the  populations. 

The test is as follows. Assume that there are $n_1$ training cells known to be in population
1 and $n_2$ in population 2 and that there are k discriminators. Kendall (1966) shows that
the statistic

$$F_0 = \frac{n_1 n_2 (n_1 + n_2 - k - 1)}{(n_1 + n_2)(n_1 + n_2 - 2)\,k}\, D^2$$

has very nearly the same distribution as the commonly tabulated function F
(with k and $n_1 + n_2 - k - 1$ degrees of freedom). Therefore, the probability that Y has the
same mean for population 1 as for population 2 (given the difference D calculated for
the sample cells) is the probability that F is greater than $F_0$.
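
The statistic can be evaluated directly from the training-set counts. The following Python sketch is illustrative only and is not part of the original study: the function name `f_statistic` is hypothetical, and the inputs are the GOLD, HYDRO W/GOLD and COPPER rows of Table C-5. It computes the statistic and its degrees of freedom and prints them next to the tabulated value.

```python
def f_statistic(n1, n2, k, d2):
    """Return (F_0, (df1, df2)) for two training populations.

    n1, n2 -- number of training cells in populations 1 and 2
    k      -- number of discriminators
    d2     -- Mahalanobis distance D^2 between the two populations
    """
    f0 = n1 * n2 * (n1 + n2 - k - 1) * d2 / ((n1 + n2) * (n1 + n2 - 2) * k)
    return f0, (k, n1 + n2 - k - 1)


# (run name, n1, n2, k, D^2, F_0 as tabulated), copied from Table C-5
ROWS = [
    ("GOLD", 40, 572, 10, 1.39, 5.12),
    ("HYDRO W/GOLD", 86, 526, 13, 1.26, 7.02),
    ("COPPER", 25, 587, 10, 1.95, 4.61),
]

for name, n1, n2, k, d2, tabulated in ROWS:
    f0, (df1, df2) = f_statistic(n1, n2, k, d2)
    print(f"{name}: F_0 = {f0:.2f} (tabulated {tabulated}), df = ({df1}, {df2})")
```

Each computed value is then compared with the tabulated F value for the same degrees of freedom at the chosen confidence level.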

This  test  does  not  indicate  how  reliably  Y  determines  to  which  population  a  given  cell 
belongs.  Even  with  high  confidence  that  Y  has  different  means  for  different  populations, 
there  may  be  substantial  overlap  between  the  populations.  In  this  case,  the  second 
requirement  mentioned  above  is  violated.  Tests  for  this  violation  are  discussed  in 
Sections  5.6.2,  5.6.3  and  5.6.4  below. 
Table C-5 lists the results of this F-test for the discriminant analyses performed on the
two-category sets of cells. The two categories are as follows (except in HYDWOGP,
which will be explained later):

1.  Cells  where  the  specified  resources  occur. 

2.  Cells  where  the  specified  resources  do  not  occur. 

In  every  case,  the  assumption  of  different  values  of  Y  for  different  populations  is  reliable 
to  at  least  the  99  percent  confidence  level  (i.e.,  the  probability  that  the  means  are  the 
same is less than 1 percent).

Table  C-6  lists  the  results  of  the  F  test  for  the  three-category  discriminant  analyses. 
The  three  categories  are: 

1.  Cells  where  the  specified  resources  occur  in  quantities  valued  at  up  to 
$50,000. 

2.  Cells  where  the  specified  resources  occur  in  quantities  valued  at  more 
than  $50,000. 

3.  Cells  where  the  specified  resources  do  not  occur. 

Three-population discriminant analyses yield three F-tests: one between each
(unordered) pair of populations. Thus, three rows appear in Table C-6 for each run. For
none of the commodity categories are all three F-tests significant at the 5 percent
level. F-tests for populations 1 and 3 are uniformly significant beyond the 1
percent level, and in two cases (HYDRO W/GOLD and GOLD alone) the F-tests for
populations 2 and 3 are significant at the 5 percent level. These results indicate that,
although the set of discriminators chosen is useful in distinguishing between high and low
probabilities of occurrence, it is less useful in distinguishing between various commercial
levels of occurrence.

All of the approaches for testing the second requirement for adequacy of the
discriminators involve error-rate estimation. The three error-rate estimation techniques
used for this project are (1) the apparent error rate; (2) the Lachenbruch and Mickey
method; and (3) the i-th partial apparent error rate. More sophisticated approaches to
error-rate estimation are based on the assumption that the frequency distributions of the
discriminators in each population are multivariate normal with known distribution

Table C-5

TEST OF SIGNIFICANCE OF SEPARATION BETWEEN THE MEANS
IN TWO-CATEGORY DISCRIMINATION(a),(b)

Name of Run                           n1    n2     k    D^2    F_0    F.05   F.01

GOLD MINRESID(c)                      40   572    10   1.39   5.12   1.85   2.36
GOLD NO TRANSF(d)                     40   572    10   1.41   5.19   1.85   2.36
GOLD                                  40   572    10   1.39   5.12   1.85   2.36
Gold, Lead, Silver, Zinc, Copper
  (HYDRO W/GOLD)                      86   526    13   1.26   7.02   1.74   2.17
Lead, Silver, Zinc, Copper
  (HYDRO W/O GOLD)                    56   556    16   1.58   4.90   1.37   1.54
HYDWOGP                               25   587    13   2.21   4.00   1.74   2.17
IRON-MANGANESE                        13   599     8   2.15   3.37   1.69   2.07
COPPER                                25   587    10   1.95   4.61   1.85   2.36
TUNGSTEN                              21   591     8   2.07   5.19   1.69   2.07

(a) The F-test has been used as discussed in the text above. When F_0 exceeds F.05,
    the separation between the means is significant at the 95 percent confidence level.
    When F_0 exceeds F.01, the separation is significant at the 99 percent level. As
    shown in the table, the separation between the means is significant at the 99
    percent confidence level in all cases.

(b) Except for case GOLD MINRESID, discriminators were chosen by maximizing $D^2$.

(c) Here, discriminators were chosen by minimizing $(1 + D^2/4)^{-1}$.

(d) In this case, the discriminators were not transformed to approach normality.


Table C-6

RESULTS OF F-TEST FOR THREE-CATEGORY DISCRIMINATION(a),(b)

                               Categories                                         Confidence Level
Name of Run                    Compared      n1    n2    n3     k    D^2    F_0     .05    .01

Lead, Silver, Zinc, Copper     1, 2          52     4    --    18   5.00   0.71    1.83   2.38
  (HYDRO W/O GOLD)             1, 3          52    --   556    18   1.55   3.98    1.58   1.88
                               2, 3          --     4   556    18   5.24   1.12    1.58   1.88

Gold, Lead, Silver, Zinc,      1, 2          72    14    --    16   2.37   1.43    1.79   2.28
  Copper (HYDRO W/GOLD)        1, 3          72    --   526    16   1.23   4.75    1.67   2.03
                               2, 3          --    14   526    16   2.57   2.13    1.67   2.03

GOLD                           1, 2          29    11    --    12   2.07   0.98    2.13   2.93
                               1, 3          29    --   572    12   1.45   3.27    1.78   2.22
                               2, 3          --    11   572    12   2.14   1.89    1.78   2.22

TUNGSTEN                       1, 2          19     2    --     6   4.35   0.97    2.85   4.46
                               1, 3          19    --   591     6   2.11   6.42    2.12   2.83
                               2, 3          --     2   591     6   4.30   1.42    2.12   2.83

(a) The F-test guarantees a certain confidence level when F_0 exceeds the value of F
    associated with that level.

(b) Discriminators were chosen by maximizing $D^2$.

parameters. Since this assumption is not always true for the cases in question, these
approaches  were  not  used. 

5.6.2  Apparent  Error  Rate 

An  obvious  method  for  estimating  the  discriminant  error  rates  for  a  group  of  cell 
populations  is  to  compute  the  rates  at  which  the  discriminating  scheme  misclassifies  the 
training  cells,  whose  population  memberships  are  known.  These  rates  are  called  the 
"apparent  error  rates"  and  are  not  dependent  on  assumptions  of  normal  distribution. 

Specifically, suppose the sample set consists of N cells, $n_i$ of which are in population i. If
the discriminant function Y misclassifies $m_i$ of population i, i.e., assigns $m_i$ of these cells
to categories other than i, then the apparent error rate is:

$$\frac{m_1 + m_2 + \cdots + m_p}{n_1 + n_2 + \cdots + n_p}$$

where p is the number of categories. (In our case, p = 2 or 3.) The apparent error rate is
biased: it generally gives lower-than-realistic error-rate estimates. This is primarily
because the apparent error rate is the rate at which the discriminant function
misclassifies precisely the same cells that it was chosen to classify optimally.
It is reasonable to expect more accurate classification of these
cells than of all cells being evaluated. Results of the apparent error-rate calculations
are in Table C-7.
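
As a minimal sketch of this calculation (the function name is hypothetical, and the counts are the GOLD training-set figures reported in Table C-10), the apparent error rate is the total number of misclassified training cells divided by the total number of training cells:

```python
def apparent_error_rate(misclassified, totals):
    """Fraction of all training cells that the discriminant function misclassifies.

    misclassified -- m_i, misclassified training cells in each category
    totals        -- n_i, training cells in each category
    """
    if len(misclassified) != len(totals):
        raise ValueError("one count per category is required")
    return sum(misclassified) / sum(totals)


# GOLD run: 15 of 40 occurrence cells and 101 of 572 non-occurrence cells
# were misclassified (Table C-10).
rate = apparent_error_rate([15, 101], [40, 572])
print(f"apparent error rate = {rate:.2f}")
```

For these counts the result rounds to 0.19, the value listed for GOLD in Table C-7.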

5.6.3  Lachenbruch  and  Mickey  Error  Function 

Lachenbruch and Mickey (80, 81) found that $\mathrm{erf}(-D_s/2)$, where

$$D_s^2 = \frac{n_1 + n_2 - k - 3}{n_1 + n_2 - 2}\, D^2$$

and erf is the error function,

$$\mathrm{erf}(\eta) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\eta} e^{-t^2/2}\, dt$$

Table C-7

ESTIMATED ERROR RATES FOR TWO-CATEGORY DISCRIMINATIONS

(Fraction of cells estimated to be incorrectly classified)

                         Apparent
Run Name                 Error Rate     erf(-D_s/2)

GOLD NO TRANSF(a)        0.16           0.28
GOLD                     0.19           0.28
HYDRO W/GOLD             0.21           0.29
HYDRO W/O GOLD           0.15           0.27
HYDWOGP                  0.07           0.23
IRON - MANGANESE         0.09           0.23
COPPER                   0.09           0.25
TUNGSTEN                 0.08           0.24

(a) The discriminators were not transformed to approach normality.

gives a more accurate estimate of error rates provided the assumption of normality is
not gratuitous.

Table C-7 lists the apparent error rates and $\mathrm{erf}(-D_s/2)$ estimates for the cases in which
discriminant functions were computed for two categories. As expected, the apparent
error rates are consistently lower than those computed from $\mathrm{erf}(-D_s/2)$. In most cases,
the values of the discriminators have been transformed so that the distributions of their
images are more closely normal. Extensive controls were not run to guarantee the
validity of the normality assumption.
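
The estimate can be sketched in Python as follows. This is an illustration under a stated assumption, not the study's own program: it assumes that the report's "erf" denotes the standard normal cumulative distribution, which differs from the function that Python's `math.erf` implements (that function is used here only as a building block). The function names are hypothetical, and the inputs are taken from Table C-5.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution, written with math.erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def lachenbruch_mickey_error(n1, n2, k, d2):
    """Estimated error rate: normal CDF evaluated at -D_s/2 (assumed reading of 'erf')."""
    ds2 = (n1 + n2 - k - 3) / (n1 + n2 - 2) * d2  # adjusted distance D_s^2
    return normal_cdf(-math.sqrt(ds2) / 2.0)


# n1, n2, k and D^2 from Table C-5
print(f"GOLD:    {lachenbruch_mickey_error(40, 572, 10, 1.39):.2f}")
print(f"HYDWOGP: {lachenbruch_mickey_error(25, 587, 13, 2.21):.2f}")
```

Under that assumption, these two runs give 0.28 and 0.23, matching their entries in Table C-7.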

5.6.4  Partial  Apparent  Error  Rates 

The i-th partial apparent error rate, $P_i$, is the fraction of sample category i misclassified
as belonging to some other category. For example, say $(x_1, x_2, \ldots, x_n)$ are the n
members of the training set initially classified in the occurrence category and
$(y_1, y_2, \ldots, y_m)$ are the m members of the training set initially assigned to the no known
occurrence category. Further suppose that the discriminant function assigns $n_1$ members
of $(x_1, x_2, \ldots, x_n)$ to the no known occurrence category (misclassifications) and $m_1$
members of $(y_1, y_2, \ldots, y_m)$ to the occurrence category (also misclassifications).
Then the partial apparent error rates are:

$$P_1 = \frac{n_1}{n}, \quad \text{and} \quad P_2 = \frac{m_1}{m}$$
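
A short Python sketch of the two rates follows. The function name is hypothetical, and the counts are the GOLD training-set figures from Table C-10.

```python
def partial_apparent_error_rates(n_misclassified, n_total, m_misclassified, m_total):
    """Return (P_1, P_2): the misclassified fraction of each training category."""
    return n_misclassified / n_total, m_misclassified / m_total


# GOLD run (Table C-10): 15 of 40 occurrence cells and 101 of 572
# no-known-occurrence cells were misclassified.
p1, p2 = partial_apparent_error_rates(15, 40, 101, 572)
print(f"P_1 = {p1:.3f}, P_2 = {p2:.3f}")
```

Here the first rate is 0.375, the value listed for GOLD in Table C-8, and it is roughly twice the second rate.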


Applying  this  technique  to  the  CDCA  runs  indicates  that,  in  general,  a  greater  fraction 
of  the  sample  cells  in  category  I  (occurrence)  is  misclassified  as  belonging  to  category  2 
(no  known  occurrence)  than  vice  versa.  Two  possible  explanations,  among  many,  for  this 
discrepancy  are: 

1. Because category 1 is defined as containing cells having any
occurrence  of  the  specified  resource,  the  discriminant  function  may 
classify  cells  with  geologically  erratic  occurrences  of  the  mineral  as 
members  of  the  no  known  occurrence  category. 

2.  The  partial  apparent  error  rate  of  a  category  may  depend  inversely  on 
the  proportion  of  that  category. 

The first possibility was explored in the run HYDWOGP. In this run, one category is
defined as containing cells in which the specified minerals have been produced in
commercial quantities and the other category consists of cells with no known
commercial production of the minerals. Comparison between this run and HYDRO W/O
GOLD (same minerals with the usual definitions of categories) shows that the first
partial apparent error rate in HYDWOGP is 44.0 percent, whereas that in HYDRO W/O
GOLD is 30.4 percent. This hardly suggests that geological variations are the major
source of misclassification in category 1; including certain possibly erratic occurrences
gives less partial apparent error than excluding them in this example.

The second possibility is extremely difficult to analyze without making some assumptions
concerning the distributions of the geologic variables. Qualitatively, it is reasonable to
expect an increase in the first partial error rate with decreasing $n_1$ for a fixed overall
number of training cells. Table C-8 shows values of $P_1$ for the two-population runs.

5.7     CASES  SELECTED 

Two  general  conclusions  are  drawn  in  the  discussion  above.  First,  the  three-category 
runs  do  not  reliably  differentiate  among  the  three  distinct  categories.  The  assumption 
that  the  low-commercial-value  and  high-commercial-value  cells  comprise  separate 
categories  distinguished  by  our  discriminators  cannot  be  made  with  confidence.  It  should 
be  noted,  though,  that  some  results  of  the  three-category  run  on  HYDRO  W/O  GOLD 
(lead,  silver,  zinc  and  copper)  are  retained  in  this  study.  In  that  case,  except  for  certain 
cells  where  the  probability  of  having  the  predicted  category  membership  is  very  high,  the 
two  occurrence  categories  are  treated  as  one.  Specifically,  only  where  the  probability  of 
membership  in  either  occurrence  category  is  at  least  .9  are  the  two  occurrence  levels 
distinguished.   None  of  the  other  three-category  runs  is  used. 

Table C-8

FIRST PARTIAL APPARENT ERROR RATE P_1 FOR TWO-CATEGORY RUNS(a),(b)

                         Number n1 of Cells
Name of Run              In Category 1          P_1

GOLD                     40                     .375
GOLD - MINRESID          40                     .375
GOLD NO TRANSF           40                     .375
HYDRO W/GOLD             86                     .326
HYDRO W/O GOLD           56                     .304
IRON - MANGANESE         13                     .385
COPPER                   25                     .480
TUNGSTEN                 21                     .429

(a) P_1 is the fraction of training cells in category 1 predicted by the
    discriminant analysis to be in category 2.

(b) HYDWOGP is not included in this table.

The other general conclusion is that the two-category runs all have undesirably high error
rates  for  the  occurrence  category.  For  this  reason,  decisions  should  not  be  made  that 
rely  heavily  on  results  indicating  no  mineral  occurrence  in  a  set  of  cells;  the  discriminant 
functions  misclassify  a  significant  fraction  of  cells  where  the  minerals  are  known  to 
occur. 

Provided these warnings are observed, the two-category runs may be usable from a statistical
point of view. Minor practical considerations, however, suggest excluding some of the runs:

•  GOLD MINRESID is excluded because its reliance on a different
optimization scheme, minimizing $(1 + D^2/4)^{-1}$, makes essentially no
difference in the results.

•  GOLD  -  NO  TRANSF  is  excluded  because,  to  the  extent  that 
transformations  "normalize"  the  geological  variables  in  GOLD,  error 
rates  are  more  accurately  measured. 

•  TUNGSTEN  is  excluded  because  the  partial  apparent  error  rate  on  the 
occurrence  category  is  above  0.4.  Thus  in  this  case,  discriminant 
analysis  is  hardly  better  than  guesswork  (partial  apparent  error  rate  of 
0.5). 

•  COPPER was not used because copper was included in HYDRO W/GOLD
and the partial apparent error rate for COPPER alone, in the
occurrence category, is nearly 0.5.

5.8      RESULTS 

Three  cases  of  discriminant  function  analysis  were  selected  as  containing  potentially 
useful results. For visual presentation, maps showing the results of the DFA
predictions  and  reported  occurrences  for  the  three  cases  are  contained  herein.  The  three 
cases  are: 

•  Gold 

•  Silver,  Lead,  Copper  and  Zinc  combined 

•  Iron  and  Manganese  combined. 

5.8.1 Statistical Interpretation

DFA  calculates  from  data  in  the  training  set  one  function  (two  functions  when  three 
categories  are  considered)  and  two  group  means  (three  when  three  categories  are 
considered)  corresponding  to  each  of  the  pre-defined  production  categories.  This 
function  (or  functions)  is  then  applied  to  all  cells  in  the  area,  the  scores  are  compared  to 
the  value  of  the  group  means  and  the  cells  are  assigned  to  the  group  whose  mean  their 
function  score  most  closely  matches.  In  addition,  there  is  a  probability  attached  to  each 
cell  which  is  a  measure  of  how  close  that  cell's  function  score  is  to  the  group  mean.  A 
probability of 100 percent indicates that the score is identical to the mean. A 90 percent
probability indicates that the score is very close to the group mean, but not exactly the
same. A 50 percent probability says the score is exactly between the two group means.
The assigned probability is the probability of correct classification. The distinction
between it and probability of occurrence is that the former measures how closely a score
matches a calculated mean, while to determine the latter, one must, in addition, know
how closely the group mean corresponds to the real geologic environment of the
corresponding  production  category.  Therefore,  a  key  assumption  in  the  application  of 
DFA  is  that  the  discriminant  function  does,  in  fact,  correspond  mathematically  to  the 
geologic  factors  affecting  mineralization.  The  following  section  presents  a  geologic 
interpretation   of  the  DFA  results. 

While  the  DFA  results  are  useful  for  a  "first  cut"  classification  of  mineral  potential, 
there  are  sources  of  uncertainty.  The  fact  that  a  particular  cell  in  the  training  set 
contains  no  reported  occurrence  of  gold  does  not  establish  that  there  are  absolutely  no 
gold  occurrences  in  it.  Indeed,  gold  occurrences  may  be  present  which  are  unknown,  or 
there  may  be  occurrences  which  are  known  but  not  reported.  Nevertheless,  the  lack  of 
reported  occurrences  defines  this  particular  cell  as  a  "non-occurrence"  cell  in  the 
training  set.  In  fact,  any  cell  that  was  either  initially  defined  (in  the  training  set)  as  a 
"non-occurrence"  cell,  or  was  subsequently  classified  by  DFA  as  a  "non-occurrence"  cell, 
has  some  likelihood  of  containing  one  or  more  gold  occurrences,  especially  considering 
the  widespread  occurrences  of  gold  in  trace  quantities  in  most  rocks  and  sediments. 
Similarly,  there  is  uncertainty  concerning  a  cell  which  is  initially  defined  or  subsequently 
classified  as  an  "occurrence"  cell.  Some  of  the  reported  occurrences  may  not  be  of 
economic  importance  in  any  sense  and  may  have  yielded  little  more  than  traces  of  gold. 
In  addition,  some  reports  of  the  presence  of  gold  may  be  in  error. 

A comparison of the production localities for each reported actual occurrence with the
DFA  results  indicates  several  "misclassifications".  Some  cells  that  have  known 
occurrences  are  classified  in  the  "non-occurrence"  category.  This  is  one  measure  of  the 
statistical  error  when  applying  the  DFA  procedure.  A  discussion  of  "misclassification" 
for  each  commodity  category  is  contained  in  the  appropriate  subsection  below. 

The  probability  estimates  of  correct  classification  pertain  to  each  4  km  by  4  km  cell  as  a 
whole  and  not  to  a  point  or  points  within  the  cell.  Comparison  with  the  geologic  map 
may  suggest  that  only  part  of  the  cell  has  any  actual  potential  for  occurrence.  Thus,  for 
appraising  a  particular  cell,  the  DFA  results  must  be  analyzed  in  the  light  of  the  geology 
in  that  cell. 

The  main  source  of  geologic  data  was  the  1:250,000  scale  Geologic  Map  of  California. 
This map, published in 1° by 2° quadrangles, is a compilation of other geologic maps
prepared at different times, at different scales and by different persons with different
objectives, interests and perceptions. More detailed geologic data might improve the
reliability of the DFA results. Examples of such data are the presence of gossans; other
evidence of alteration associated with ore deposits; and the presence of carbonates,
especially where they have been invaded by granitic intrusives. However, there is only
scant  information  concerning  lithologic  details  of  the  sedimentary  sequences  involved  in 
the  1:250,000  scale  maps. 

LANDSAT lineament data would very likely cause some measurable improvement in the
DFA results (i.e., a reduction in the number of misclassifications), because there is
probably a genetic relationship between some of the lineaments and the
occurrence of ore deposits. The degree to which the DFA results would be improved
cannot be accurately forecast, but the effort involved in incorporating the LANDSAT
lineament data into the DFA study would be modest.

Incorporation  of  numerically  encoded  LANDSAT  imagery  itself  into  the  DFA  study  would 
probably  yield  a  substantial  improvement.  The  reason  for  this  is  that  the  imagery  (quite 
apart  from  the  lineament  analysis)  probably  incorporates  the  effects  of  a  variety  of 
processes  related  to  ore  deposition,  including  large-scale  hydrothermal  rock  alteration 
effects,  and  gossans  or  other  weathering  and  near-surface  phenomena. 

Gold


The  results  for  gold  (see  the  enclosed  map)  indicate  that  only  three  geologic  variables 
provide  a  significant  contribution  to  the  discrimination  process.  These  are  the  areal 
proportion  of  Precambrian  metamorphics  (Variable  2);  the  length  of  contact  between 
Precambrian  granitic  rocks  and  Precambrian  metamorphics  (Variable  19);  and  the  areal 
proportion  of  Mesozoic  granitic  intrusives  or  pre-Cenozoic  granitic  and  metamorphic 
rocks (Variable 11). The variables contributing to the discrimination process are shown in
Table  C-9  with  the  most  effective  listed  first  and  the  remaining  variables  in  order  of 
decreasing  effects  in  the  discrimination  process. 

A  comparison  of  the  correct  classifications  versus  misclassifications  in  the  training  set  is 
presented in Table C-10. Of the 40 cells in the training set which were defined a priori
as  "occurrence"  cells,  25  (62.5  percent)  were  correctly  classified  by  DFA,  and  15  (37.5 
percent)  were  incorrectly  classified.  Of  the  572  cells  in  the  training  set  defined  a  priori 
as  "non-occurrence"  cells,  471  (82.3  percent)  were  correctly  classified  by  DFA,  and  101 
(17.7  percent)  were  misclassified. 

As shown in Table C-11, only about 5 percent of all reported gold occurrences are in
predicted low probability cells. This measure may be slightly biased since the low
probability areas may not have been as extensively explored as the high probability areas.
Nevertheless, the results offer strong support for using DFA as an indicator of where
gold mineralization is not likely to occur. One interesting aspect shown in Table C-11 is
that the two higher production categories have a significantly larger proportion of their
occurrences misclassified. Two possible explanations for this are: (1) that the
discriminant function is tuned to the geologic environments of the low value occurrences
because they dominate in number and, therefore, carry more weight in the derivation of
the discriminant function; and (2) that high value placer deposits, for which no
consistent geologic environment can be defined, are located in the DFA predicted low
likelihood occurrence cells. If placer deposits were eliminated from this analysis, the
number of misclassifications would necessarily decrease.
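
The percentages quoted from Table C-11 are simple ratios of counts. The sketch below (hypothetical names, counts copied from Table C-11) recomputes the share of known gold deposits that fall in low probability cells for each production category and for all categories combined.

```python
# Gold deposits by production category, from Table C-11:
# category -> (number in CDCA, number in low probability cells)
GOLD_DEPOSITS = {
    0: (166, 3),
    1: (400, 15),
    2: (172, 13),
    3: (46, 7),
    4: (22, 5),
}


def low_probability_share(total, in_low_cells):
    """Percentage of known deposits that fall in low probability cells."""
    return 100.0 * in_low_cells / total


shares = {cat: low_probability_share(*counts) for cat, counts in GOLD_DEPOSITS.items()}
all_total = sum(total for total, _ in GOLD_DEPOSITS.values())
all_low = sum(low for _, low in GOLD_DEPOSITS.values())
overall = low_probability_share(all_total, all_low)

for cat, share in shares.items():
    print(f"category {cat}: {share:.1f}%")
print(f"all categories: {overall:.1f}%")
```

The share rises steadily from category 0 to category 4, which is the pattern the text describes for the higher production categories.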

Table C-9

DFA RESULTS FOR GOLD(a)

DFA VARIABLES

Number   Variable Name                                                   F Value(b)

2        Precambrian metamorphics                                        22.7
19       Contact of Precambrian granite with Precambrian metamorphics    18.1
11       Mesozoic granite and Pre-Cenozoic granite and metamorphics      16.3
13       Tertiary sediments                                               4.5
37       Length of non-thrust faults                                      3.3
41       Bouguer gravity                                                  3.2
39       Number of fault intersections                                    2.8
10       Mesozoic basic intrusives                                        2.7
20       Contact of Mesozoic granite with Paleozoic sedimentary rocks     2.4
14       Tertiary intrusives                                              2.1

(a) Geological variables are ranked in decreasing order of their contribution to the
    discrimination process.

(b) F Value is a measure of the relative contribution of the variable to the discriminant
    function (77).


Table C-10

DFA RESULTS FOR GOLD

Training Cells Correctly and Incorrectly Classified

                                  Correctly Classified   Incorrectly Classified
Actual                            By DFA                 By DFA

Occurrence                40       25  (62.5%)            15  (37.5%)
No Known Occurrence      572      471  (82.3%)           101  (17.7%)

Total                    612      496  (81.0%)           116  (19.0%)


Table C-11

DFA RESULTS FOR GOLD

Known Deposits In Low Probability Cells

Production Category    Number In    Number In Low Probability Cell
Of Known Deposit       CDCA         (Percentage Of Total)

0                      166           3  (1.8%)
1                      400          15  (3.7%)
2                      172          13  (7.6%)
3                       46           7  (15.2%)
4                       22           5  (22.7%)

TOTAL                  806          43  (5.3%)

Low probability cells are cells classified as 10 percent or less probability of occurrence.

Production Categories:

0 = Occurrence
1 = Workings, but no production
2 = Production under $50,000
3 = Production between $50,000 and $500,000
4 = Production over $500,000


Copper-Lead-Zinc-Silver DFA Results

Results for combined copper, lead, zinc and silver (see enclosed map) indicate four or
possibly five variables provide significant contribution to the discriminant process. The
geological variables that were employed in the DFA are listed in Table C-12 with the
variables ranked in order of decreasing contribution. The first five variables, Ordovician
through Mississippian sediments (Variable 4), contact of Tertiary igneous intrusives with
Mesozoic granitic intrusives (32), Precambrian metamorphic rocks (2), Precambrian
granitic rocks (1), and Tertiary igneous intrusives (14), contributed most to the DFA's
discriminatory power. The various fault relationships, including curvature of faults,
contribute very little to the discrimination process.

The  results  show  a  number  of  misclassifications  as  summarized  in  Tables  C-13  and  C-14. 
A cell that is classified in the low probability of occurrence category but which has one
or  more  known  production  occurrences,  is  misclassified  (assuming  the  report  of  actual 
occurrence  is  correct)  and  must  be  regarded  as  having  mineral  potential  regardless  of  the 
DFA  results.  A  cell  which  receives  favorable  DFA  classification,  say  90  percent 
probability  of  correct  classification  in  the  category  consisting  of  production  of  $50,000 
or  more,  but  which  lacks  any  reported  actual  production  or  any  known  occurrences,  is 
also  favorable.  But,  the  degree  of  certainty  that  one  or  more  ore  deposits  are  present  in 
such  a  situation  is  less  than  in  those  cells  in  which  there  is  absolute  certainty  that  a 
deposit  is  present  (i.e.,  where  there  has  been  a  producing  mine). 

As  shown  in  Table  C-14,  only  about  6  percent  of  all  occurrences  of  copper,  lead,  silver 
and  zinc  are  in  cells  assigned  a  low  probability  of  occurrence.  The  proportion  of 
misclassifications  is  relatively  constant  across  production  categories.  This  result  may  be 
slightly  biased  since  the  low  probability  areas  may  not  have  been  as  extensively  explored 
as  high  probability  areas.  Nevertheless,  these  results  provide  strong  support  for  the  use 
of  DFA  as  an  indicator  of  where  copper,  lead,  silver  and  zinc  mineralization  is  not  likely 
to  occur. 

Table C-12

DFA RESULTS FOR COMBINED COPPER-LEAD-ZINC-SILVER(a)

DFA VARIABLES

Number   Variable Name                                                       F Value(b)

4        Ordovician through Mississippian marine sediments                   31.2
32       Contact Tertiary intrusives (14) with Mesozoic granite (11)         14.7
2        Precambrian metamorphics                                            14.1
1        Precambrian granite                                                 13.3
14       Tertiary intrusives                                                  9.4
20       Contact Mesozoic granite (11) with Paleozoic sediments (4 and 5)     6.6
5        Pennsylvanian and Permian marine sediments                           5.6
3        Cambrian and Precambrian sediments                                   4.7
11       Mesozoic granite and pre-Cenozoic granite and metamorphics           4.2
41       Bouguer gravity                                                      3.6
19       Contact Precambrian granite (1) with Precambrian metamorphics (2)    3.3
40       Curvature of faults                                                  2.9
39       Number of fault intersections                                        2.7
36       Number of thrust faults                                              1.9

(a) Geological variables are ranked in decreasing order of their contribution to the
    discrimination process.

(b) F Value is a measure of the relative contribution of the variable to the discriminant
    function (77).

Table C-13

DFA RESULTS FOR COMBINED LEAD, SILVER, ZINC AND COPPER

Training Cells Correctly and Incorrectly Classified

                                                         Correctly       Incorrectly
                                                         Classified      Classified
                                               Actual    By DFA          By DFA

Production of $50,000 or more                       4      3 (75.0%)       1 (25.0%)
Occurrence, but production less than $50,000       52     32 (61.5%)      20 (38.5%)
No Reported Occurrence                            556    477 (85.8%)      79 (14.2%)

Total                                             612    512 (83.7%)     100 (16.3%)
Table C-14

DFA RESULTS FOR COMBINED LEAD, SILVER, ZINC, AND COPPER

Known Deposits In Low Probability Cells

Production                        Number In Low
Category Of       Number In       Probability Cell
Known Deposit     CDCA            (Percentage Of Total)

      0              160             11 ( 6.9%)
      1              280             18 ( 6.4%)
      2              148              8 ( 5.4%)
      3               30              1 ( 3.3%)
      4                9              0 ( 0.0%)

    TOTAL            627             38 ( 6.1%)

Cells classified as 10 percent or less probability of occurrence.

Production Categories:

0 = Occurrence
1 = Workings, but no production
2 = Production under $50,000
3 = Production between $50,000 and $500,000
4 = Production over $500,000
Iron and Manganese

The DFA results for iron and manganese (see the enclosed map) show that the most
influential geological variables are the areal proportion of Tertiary igneous intrusives and
the areal proportion of Precambrian metamorphics. Contact relationships involving
Tertiary igneous intrusives with Tertiary sediments and with Mesozoic granitic intrusives
follow in third and fourth place. The geologic variables that were employed in the DFA
are listed in Table C-15.
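
The ordering described above can be reproduced directly from the F values in Table C-15. The following is a minimal sketch only: the values are transcribed from that table, and the names `TABLE_C15` and `rank_by_f_value` are hypothetical. It ranks variables by their reported F value and does not recompute the discriminant function.

```python
# F values transcribed from Table C-15 (iron and manganese).
# Each entry: (variable number, variable name, F value).
TABLE_C15 = [
    (14, "Tertiary igneous intrusives", 25.3),
    (2, "Precambrian metamorphics", 13.1),
    (34, "Contact of Tertiary sediments and Tertiary igneous intrusives", 8.5),
    (32, "Contact between Tertiary igneous intrusives and Mesozoic granitic intrusives", 5.6),
    (15, "Tertiary volcanics", 3.3),
]


def rank_by_f_value(rows):
    """Return (rank, number, name, F) tuples sorted by decreasing F value."""
    ordered = sorted(rows, key=lambda row: row[2], reverse=True)
    return [(rank, num, name, f)
            for rank, (num, name, f) in enumerate(ordered, start=1)]


ranking = rank_by_f_value(TABLE_C15)
for rank, num, name, f in ranking:
    print(f"{rank}. Variable {num:>2} (F = {f:4.1f}): {name}")
```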

The comparison of correct classification versus misclassification of the training cells is
presented in Table C-16. The proportion of training-set cells defined a priori to be in the
"occurrence" category that were misclassified is relatively high (38.5 percent). The
misclassification proportion of those training-set cells classified a priori as
"non-occurrence" cells is only 8.5 percent.
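
The two misclassification proportions quoted above follow from the counts in Table C-16. The sketch below is a hedged illustration of that arithmetic: the counts are transcribed from Table C-16, and the names `TABLE_C16` and `misclassification_rates` are hypothetical.

```python
# Counts transcribed from Table C-16 (iron and manganese training cells).
# a priori category -> (correctly classified, incorrectly classified)
TABLE_C16 = {
    "occurrence": (8, 5),
    "no known occurrence": (548, 51),
}


def misclassification_rates(table):
    """Percent misclassified per a priori category, plus the overall rate."""
    rates = {}
    for name, (right, wrong) in table.items():
        rates[name] = 100.0 * wrong / (right + wrong)
    all_right = sum(r for r, _ in table.values())
    all_wrong = sum(w for _, w in table.values())
    rates["total"] = 100.0 * all_wrong / (all_right + all_wrong)
    return rates


rates = misclassification_rates(TABLE_C16)
for name, pct in rates.items():
    print(f"{name}: {pct:.1f}% misclassified")
```

Because 599 of the 612 training cells are "no known occurrence" cells, the overall rate (56 of 612) stays close to the non-occurrence rate and gives little indication of the much higher rate among the 13 occurrence cells.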

As shown in Table C-17, about 50 percent of iron and manganese occurrences fall in low
probability areas. This indicates that the DFA results should not be used to classify
areas as having low potential for iron and manganese.

Overall,  the  DFA  results  for  iron  and  manganese  show  weak  statistical  relationships  and 
the  results  should  be  used  with  caution. 

5.8.2  Interpretation  of  the  Geology 

In  interpreting  the  geologic  meaning  of  the  DFA  results,  it  is  important  to  stress  that 
DFA  yields  statistical  associations  but  not  geologic  reasons  for  the  associations.  The 
statistical  relationships  do  not  necessarily  connote  a  cause  and  effect  relationship. 

Gold 

As  Table  C-9  reveals,  gold  occurrences  in  the  CDCA  are  statistically  linked  with  the 
presence  of  Precambrian  metamorphics,  contacts  between  Precambrian  metamorphics 
and  Precambrian  granite,  and  Mesozoic  granitic  intrusives.  In  Precambrian 
metamorphics,  the  gold  may  occur  in  hydrothermal  deposits  that  are  related,  directly  or 
indirectly,  to  the  presence  of  granitic  intrusives,  either  of  Precambrian  or  Mesozoic  age. 
This association is not surprising. It is somewhat surprising, however, that the DFA results
are so little influenced by the presence of Tertiary igneous intrusives.
Table C-15

DFA RESULTS FOR IRON AND MANGANESE (a)

DFA VARIABLES

Number   Variable Name                                                       F Value (b)

  14     Tertiary igneous intrusives                                            25.3
   2     Precambrian metamorphics                                               13.1
  34     Contact of Tertiary sediments and Tertiary igneous intrusives           8.5
  32     Contact between Tertiary igneous intrusives and Mesozoic
         granitic intrusives                                                     5.6
  15     Tertiary volcanics                                                      3.3

(a) Geological variables are ranked in decreasing order of their contribution to the
    discrimination process.

(b) F Value is a measure of the relative contribution of the variable to the discriminant
    function (77).
Table C-16

DFA RESULTS FOR IRON AND MANGANESE

Training Cells Correctly and Incorrectly Classified

                                     Correctly       Incorrectly
                                     Classified      Classified
                           Actual    By DFA          By DFA

Occurrence                     13      8 (61.5%)       5 (38.5%)
No Known Occurrence           599    548 (91.5%)      51 ( 8.5%)

Total                         612    556 (90.8%)      56 ( 9.2%)
Table C-17

DFA RESULTS FOR IRON AND MANGANESE

Known Deposits In Low Probability Cells

Production                        Number In Low
Category Of       Number In       Probability Cell
Known Deposit     CDCA            (Percentage Of Total)

      0               55             30 ( 54.5%)
      1               76             33 ( 43.4%)
      2               40             19 ( 47.5%)
      3                4              4 (100.0%)
      4                3              1 ( 33.3%)

    TOTAL            178             87 ( 48.9%)

Cells classified as 10 percent or less probability of occurrence.

Production Categories:

0 = Occurrence
1 = Workings, but no production
2 = Production under $50,000
3 = Production between $50,000 and $500,000
4 = Production over $500,000
For each variable that exerts a significant degree of influence on the gold results, a
possible rationale for its effect is given below. The variables are discussed in decreasing
order of influence.

•  Precambrian  Metamorphics  (Geologic  Variable  2) 

The statistical association of this lithologic variable with gold probably
reflects several diverse influences. These include: (1) the presence of
carbonates (marbles) as reactive host rocks; (2) the presence of gold-bearing
quartz veins that occur within schists; and (3) the presence of
disseminated gold incorporated as detrital gold in meta-sedimentary
rocks at the time of sedimentation. Some of this originally detrital
gold may have been remobilized by hydrothermal processes.

•  Contact  of  Precambrian   Granite  with  Precambrian   Metamorphics 
(Variable  19) 

The  association  here  is  not  surprising.  Classical  theory  suggests  that 
hydrothermal solutions emanating from granitic intrusives may be the
source  of  some  of  the  gold  present  both  in  the  intrusives  and  in  the 
metamorphics.  Intrusive  contacts  have  long  been  regarded  as 
favorable  loci  for  hydrothermal  deposits.  Furthermore,  the 
hydrothermal  solutions  derived  from  the  granitic  intrusives  may  be 
responsible  for  remobilization  of  detrital  gold  and  other  forms  of 
disseminated  gold. 

•  Mesozoic Granite and Pre-Cenozoic Granites and Metamorphics
(Variable 11)

The  same  arguments  apply  here  as  above,  namely  that  acidic 
intrusives,  regardless  of  age,  are  accompanied  by  hydrothermal 
activity. 

Examples of the efficacy of these results are given by a survey of the geologic
descriptions of gold deposits in Inyo and San Bernardino Counties (summarized in Table
C-18). They indicate that many of these deposits accord to some degree with the variables
outlined above. Table C-18 shows examples of possible relationships between the geology
and the DFA results. These relationships were not verified by field checking. Since the
statistical analysis was based on maps at scale 1:250,000, detailed local geology cannot
be considered.
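
One way to see how often each variable is invoked among the examples in Table C-18 is a simple tally. The sketch below is illustrative only: the variable assignments are transcribed from Table C-18 as printed, and the names `TABLE_C18` and `tally_variables` are hypothetical. The listed deposits are representative examples rather than a sample, so the counts carry no statistical weight.

```python
from collections import Counter

# "Possible statistical variable" entries transcribed from Table C-18.
TABLE_C18 = {
    "Arando Mine": [11],
    "Ashford Mine": [2, 11],
    "Burro and Mary F. Claims": [2],
    "Corona Mine": [2, 19],
    "Del Norte Group": [2],
    "Independent Mine": [2, 19, 11],
    "Skidoo Mine": [2, 19, 11],
    "Sunset Mine": [11],
    "Alvord Mine": [2, 19, 11],
    "Brannigan Mine": [2],
    "Oro Fino Mine": [2],
    "Williams Well Placers": [13],
}


def tally_variables(assignments):
    """Count the number of listed deposits associated with each variable."""
    counts = Counter()
    for variables in assignments.values():
        counts.update(set(variables))  # count each variable once per deposit
    return counts


counts = tally_variables(TABLE_C18)
for variable, n in counts.most_common():
    print(f"Variable {variable:>2}: {n} of {len(TABLE_C18)} listed deposits")
```

With this transcription, Variable 2 is cited for 9 of the 12 listed deposits, Variable 11 for 6 and Variable 19 for 4.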
Table C-18

GEOLOGIC SETTING OF SELECTED GOLD DEPOSITS
IN INYO AND SAN BERNARDINO COUNTIES (1, 2)*

                                                                      POSSIBLE
                                                                      STATISTICAL
NAME OF MINE OR PROSPECT    GEOLOGIC SETTING                          VARIABLE

Inyo County

Arando Mine                 Quartz-bearing shear zone in granitic     11
                            rocks
Ashford Mine                Veins in granite gneiss                   2 & 11
Burro and Mary F. Claims    Quartz veins in schist                    2
Corona Mine                 Quartz veins, contact of schist with      2 & 19
                            granite
Del Norte Group             Fractured quartzite                       2
Independent Mine            Quartz masses in Precambrian dolorite     2, 19, 11
                            adjacent to diorite intrusive
Skidoo Mine                 Brecciated diorite sill intrusive into    2, 19, 11
                            limestone and quartzite
Sunset Mine                 Quartz vein in quartz monzonite           11
                            associated with granitic dike

San Bernardino County

Alvord Mine                 Quartz vein in crystalline limestone      2, 19, 11
                            in contact with granite
Brannigan Mine              Quartz veins in Precambrian quartzite     2
Oro Fino Mine               Siliceous shoots in Precambrian           2
                            quartzite schist and dolomite
Williams Well Placers       "Placer" mining of weathered granite      13
                            mantle

* Only representative examples of each setting are listed, inasmuch as many deposits
occur in similar geologic settings.
Combined Copper, Zinc, Lead and Silver

The geological associations of copper, zinc, lead and silver deposits are markedly
different from those of gold. The DFA results show a close statistical affiliation with the
presence of Ordovician through Mississippian sedimentary rocks; contacts of Tertiary
igneous intrusives with Mesozoic intrusives; and proportions of Precambrian metamorphics,
Precambrian granitic rocks, and Tertiary intrusives. The association with the Ordovician
through Mississippian marine sediments probably reflects, in part, the fact that limestones
and dolomites are present and may serve as host rocks.

Contacts  between  Tertiary  and  Mesozoic  granitic  intrusives  may  have  presented 
favorable  situations  because  of  the  derivation  of  ore-forming  hydrothermal  fluids  from 
the  Tertiary  intrusives,  with  both  the  Tertiary  and  the  Mesozoic  intrusives  serving  as 
host  rocks.  Precambrian  metamorphics  probably  serve  as  host  rocks  for  ore-forming 
fluids  derived  from  Precambrian  intrusives.  Thus,  overall  we  seem  to  detect  a  close 
relationship  between  igneous  intrusives  (of  different  ages)  and  host  rocks  which  include 
carbonate-bearing  Paleozoic  sediments,  metamorphosed  Precambrian  sediments,  and  the 
various  igneous  intrusives.  All  of  these  associations  are  compatible  with  classical  theory 
with  respect  to  the  origin  of  hydrothermal  deposits. 

The possible role or roles that the principal geologic variables may have played in
influencing copper-lead-zinc-silver deposits are described below. The variables are listed
in decreasing order of their influence.

•  Ordovician Through Mississippian Marine Sedimentary Rocks
(Variable 4)

The principal influence is probably the presence of carbonates which
serve as host rocks for hydrothermal solutions.

•  Contact  Of  Tertiary  Intrusives  With  Mesozoic  Granite   (Variable  32) 

Contact   relationships   involving   acidic   intrusive    rocks   appear  to  be 
important  ore-forming  influences  in  many  contexts  such  as  this  one. 

•  Precambrian Metamorphics (Variable 2)

These  probably  serve  as  host  rocks,  particularly  since  carbonates  are 
widely  distributed  in  Precambrian  assemblages  within  the  CDCA. 
•  Precambrian Granite (Variable 1)

The  relationship  here  is  probably  partially  a  contact  relationship 
between  younger  Precambrian  granites  and  older  Precambrian 
metamorphics. 

•  Tertiary  Intrusives   (Variable  14) 

The expanse of Tertiary intrusives, as well as their contact
relationships, exerts some influence. This is in accord with classical
ore-deposit theory.

As examples of the efficacy of these results, Table C-19 lists some of the copper, zinc,
lead, and silver mines and prospects in Inyo and San Bernardino Counties and their
geologic settings. Table C-19 shows examples of the relationship between the geology
and the DFA results. These relationships were not verified by field checking. Since the
statistical analyses were based on maps at scale 1:250,000, detailed local geology cannot
be considered.

Iron  and  Manganese 

The DFA results for iron and manganese show an association with Tertiary igneous
intrusives (Variable 14), Precambrian metamorphics (Variable 2) and contacts between
Tertiary igneous intrusives and Tertiary sediments (Variable 34). This suggests that
contact metamorphic relationships have considerable bearing. In fact, some of the
potential iron and manganese deposits may be of contact metamorphic origin. The iron
and manganese deposits in the Palo Verde Mountains in Imperial County occur where
Variables 2 and 14 are present, as do the minor manganese deposits in the Randsburg
District in San Bernardino County.
Table C-19

GEOLOGIC SETTING OF SELECTED COPPER, LEAD, SILVER AND ZINC
MINES AND PROSPECTS IN INYO AND SAN BERNARDINO COUNTIES

                                                                          POSSIBLE
                     COMMODITIES                                          STATISTICAL
MINE OR PROSPECT     PRESENT OR PRODUCED   GEOLOGIC SETTING               VARIABLE

Inyo County

Sally Ann Mine       Copper                Contact quartz monzonite       20
                                           with Paleozoic metamorphics
Argenta Mine         Lead - zinc           Contact of limestone with      4 & 2
                                           schist and quartzite
Cerro Gordo Mine     Lead, zinc, silver    Devonian quartzite and         4
                                           marble
Darwin District      Lead, silver, zinc,   Pennsylvanian limestone,       20 & 5
                     copper                shale, quartzite intruded
                                           by granodiorite
Empress Mine         Silver, zinc, copper  Quartz vein in granite,        4 & 20
                                           near contact with limestone
Lippincott Mine      Lead, zinc, silver    Siliceous veins in dolomite    4

San Bernardino County

Blue Bell Mine       Lead, silver, copper  Veins in limestone near        4 & 20
                                           intrusive contact with granite
Gold Hill Group      Silver, lead          Quartz veins in brecciated     2
                                           schist and gneiss